(54) Title: ERROR CONCEALMENT IN DIGITAL AUDIO RECEIVER
(57) Abstract 

A digital audio receiver stores received frames temporarily for
decoding and error concealment. A reconstructing block (14) in the
decoder reads stored frames using a read window (43) wherein the
latest received frame (+cnnxt) is undecoded. Decoding is carried out
in stages so that the correctness of the current frame (0) is examined
and possible errors are concealed using corresponding data of other
frames in the window. Detection of errors is based on checksums
(19, 26) and allowed values of bit combinations in certain parts of
the frame. In addition, the receiver maintains an estimate (60) for the
signal's bit error ratio and uses it to control the operation of the error
concealment algorithm.
Error concealment in digital audio receiver

The invention relates in general to detection and concealment of errors in a signal
transmitted in digital form from a transmitter to a receiver. In particular the
invention relates to detection and concealment of transmission errors in an audio
signal processed in the form of frames, by a digital audio receiver.

Transmission of an audio signal in digital form from a transmitter to a receiver is
known as such and it is going to become more common as digital television and
broadcasting systems replace older systems based on analog frequency modulation.
Known telecommunications standards dealing with the transmission of digital audio
signals include the ETS 300 401 standard by the European Broadcasting Union
(EBU) and European Telecommunications Standards Institute (ETSI) and the
ISO/IEC 11172-3 and ISO/IEC 13818-3 standards by the International Organization
for Standardization (ISO) and International Electrotechnical Commission (IEC). These
standards specify a certain frame structure for the transmission of a digital audio
signal. The ETS 300 401 standard, which is also called the DAB (Digital Audio
Broadcasting) standard, specifies a frame structure which in a way is a special case
of the frame structure specified in the ISO/IEC 11172-3 and ISO/IEC 13818-3
standards as it contains additional specifications concerning frame structure
particulars left open in the earlier standards. With an audio signal sampling
frequency of 48 kHz the DAB standard is based on the ISO/IEC 11172-3 standard
and with a sampling frequency of 24 kHz on the ISO/IEC 13818-3 standard. To
illustrate the background of the invention, the structure of the audio frame according
to the aforementioned standards and its processing in transmitter and receiver
apparatuses is described in brief below.

Fig. 1 is a simplified block diagram of an apparatus 1 according to the ISO/IEC
11172-3 and 13818-3 Layer II standards generating DAB frames from a pulse-code-
modulated (PCM) audio signal. The apparatus comprises an input port 2, output port
3, and between them, a filter bank 4, quantising and coding block 5, and a frame
generating block 6, connected in series. In parallel with the filter bank 4, there is a
psychoacoustic model block 7 the input signal of which is the same as the filter
bank input signal. The outputs of blocks 4 and 7 are taken to a bit allocation block 8
the output of which controls quantising and coding in block 5. The apparatus also
comprises a data port 9 such that digital program associated data brought thereto is
taken to the frame generating block 6 which incorporates the program associated
data in the frame structure.

Fig. 2 is a simplified block diagram of an apparatus 10 according to the ISO/IEC
11172-3 and 13818-3 Layer II standards decoding the frames generated by the
transmitter shown in Fig. 1 into a pulse-code-modulated audio signal. It comprises
an input port 11, output port 12, and between them, a frame decoding block 13,
reconstructing block 14 and an inverse filter bank 15, connected in series. The frame
decoding block 13 is also connected with a data port 16 to take program associated
data to other circuits of the receiver apparatus.

The audio signal is transmitted as frames between apparatuses according to Figs. 1
and 2. The amount of data in a single frame corresponds to a 24- or 48-ms-long
audio signal part. In addition to audio data proper the frame contains header
information, checksums, information related to the processing of audio data, and
program associated data (PAD). Since transmission paths are not ideal, errors may
occur in the contents of the frames which affect the operation of the receiver in
different ways depending on the location of the error in the frame.

Fig. 3 shows the structure of an audio frame 17 according to the DAB standard. The
frame comprises an integer number of eight-bit bytes (not shown). It starts with a
32-bit header 18, followed by a 16-bit CRC word 19. The length of the bit
allocation part 20 is 26 to 176 bits depending on the audio mode (single channel,
dual channel, stereo, joint stereo) and sampling frequency used as well as on the bit
rate used for transmitting the audio program. An SCFSI part contains instructions
for the interpretation of the scale factor part 22 following it. The scale factors in the
latter provide information about how the various parts of the signal were
emphasised at the frame generation stage. Each scale factor is represented by a
six-bit codeword (not shown) and the number of codewords in the frame varies
according to how much variation there is in the different parts of the audio signal
during the period represented by the frame. Part 23 contains the sampled values
proper which represent the sampled audio signal. If the bits representing the
sampled values do not fill the length of the space reserved for them, the empty part
is filled with padding bits 24.

At the end of the frame 17 there are, from right to left in the Figure, a fixed program
associated data (F-PAD) field 25, scale factor cyclic redundancy check (SCF CRC)
error protection 26 for the audio data, and an extended program associated data
(X-PAD) field 27. The latter is not necessarily included in every audio frame. In
accordance with the ETS 300 401 standard, the program associated data fields 25
and 27 are intended for the transmission of data that are closely related to the audio
data proper included in the frame and that may have synchronisation requirements
concerning the audio data. Their use is not mandatory. The F-PAD and X-PAD
fields together form the program associated data (PAD) part. The F-PAD field
particularly includes a two-bit X-PAD indicator (not shown) to indicate whether the
frame includes an X-PAD field and if so, whether it is a four-byte, so-called short
X-PAD field or a variable size X-PAD field.

Fig. 4 shows in more detail an audio frame header 18 the length of which is 32 bits
(four bytes). The description to follow concerns both the ISO/IEC 11172-3 and
ISO/IEC 13818-3 standards and the DAB standard so that the specifications
required by the DAB standard are mentioned separately. The first twelve bits form a
synchronisation word 29 in which all bits are ones. The next bit 30 is a so-called ID
bit wherein value "1" corresponds to the application of the ISO/IEC 11172-3
standard and value "0" corresponds to the application of the ISO/IEC DIS 13818-3
standard in the audio signal processing. The length of the Layer field 31 is two bits
and its value corresponds to the layer of the ISO/IEC 11172-3 standard in use. The
DAB standard allows values "10" (Layer II) and "00" (reserved for future
expansion). The protection bit 32 indicates whether there is a checksum in the frame, and
its value according to the DAB standard is "0", meaning a checksum is used. The
next four-bit field 33 represents the bit rate of the audio program in use. The
ISO/IEC 11172-3 and ISO/IEC 13818-3 standards do not allow the value "1111" in
the field 33. Furthermore, the DAB standard does not allow the value "0000". The
sampling frequency field 34 includes two bits representing the sampling frequency
of the original pulse-code-modulated signal. According to the DAB standard, values
"00" and "10" are not allowed in this field 34. Value "01" corresponds to a 48-kHz
sampling frequency if the ID bit is "1", and to a 24-kHz sampling frequency if the
ID bit is "0". Value "11" is reserved for future expansion. A padding indicator bit 35
is "0" according to the DAB standard because there are no padding bits in the audio
frame formed from a 48-kHz or 24-kHz PCM signal. According to the ISO/IEC
11172-3 and ISO/IEC 13818-3 standards, bit 35 is "1" if there are padding bits in
the audio frame. The Private bit 36, which is reserved for private use, has no
significance according to the DAB, ISO/IEC 11172-3 and ISO/IEC 13818-3
standards.
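
The field layout described so far can be illustrated with a short sketch. It is only
an illustration under stated assumptions: the function names, the dictionary keys
and the returned labels are hypothetical, and only the first 24 header bits (fields
29 to 36) are interpreted.

```python
from typing import Dict, List


def parse_header_prefix(header: bytes) -> Dict[str, int]:
    """Split the first 24 bits of a 32-bit frame header into fields 29 to 36."""
    if len(header) != 4:
        raise ValueError("header must be exactly four bytes")
    word = int.from_bytes(header, "big")
    return {
        "syncword": (word >> 20) & 0xFFF,  # 12 bits, all ones expected
        "id": (word >> 19) & 0x1,          # 1: ISO/IEC 11172-3, 0: 13818-3
        "layer": (word >> 17) & 0x3,
        "protection": (word >> 16) & 0x1,
        "bit_rate": (word >> 12) & 0xF,
        "sampling": (word >> 10) & 0x3,
        "padding": (word >> 9) & 0x1,
        "private": (word >> 8) & 0x1,      # no significance, not checked
    }


def dab_header_violations(fields: Dict[str, int]) -> List[str]:
    """Return labels of fields whose value the text above calls not allowed."""
    problems = []
    if fields["syncword"] != 0xFFF:
        problems.append("SYNCWORD")
    if fields["layer"] not in (0b10, 0b00):
        problems.append("LAYER")
    if fields["protection"] != 0:
        problems.append("PROTECTION")
    if fields["bit_rate"] in (0b0000, 0b1111):
        problems.append("BIT RATE")
    if fields["sampling"] in (0b00, 0b10):
        problems.append("SAMPLING FREQUENCY")
    if fields["padding"] != 0:
        problems.append("PADDING BIT")
    return problems


good = parse_header_prefix(bytes.fromhex("fffc8400"))
bad = parse_header_prefix(bytes.fromhex("fffe0400"))
```

Here `good` carries ID "1", layer "10", bit rate field "1000" and sampling field
"01", whereas `bad` carries layer "11" and bit rate field "0000".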

A two-bit field 37 indicates the audio program's transmission mode which can be
stereo ("00"), joint stereo ("01"), dual channel ("10") or single channel ("11"). The
joint stereo mode in accordance with the DAB standard is also known as "intensity
stereo". At a sampling frequency of 48 kHz, the values of fields 37 and 33 correlate
such that only the following combinations are allowed:

bit rate (kbit/s)     modes allowed           field 33 value            field 37 value
--------------------------------------------------------------------------------------
32, 48, 56, 80        single channel          "0001", "0010",           "11"
                                              "0011", "0101"

224, 256, 320, 384    stereo, joint stereo,   "1011", "1100",           "00", "01", "10"
                      dual channel            "1101", "1110"

64, 96, 112, 128,     all modes               "0100", "0110", "0111",   all values
160, 192                                      "1000", "1001", "1010"

For 24 kHz [remainder of the sentence illegible in the source].
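
The 48-kHz table above can be expressed as a small lookup. This is a sketch only;
the set names and the function name are hypothetical, and the 24-kHz case is not
covered because its allowed combinations are not listed here.

```python
# Field 33 values grouped as in the table (48 kHz sampling frequency only).
SINGLE_CHANNEL_ONLY = {0b0001, 0b0010, 0b0011, 0b0101}  # 32, 48, 56, 80 kbit/s
MULTI_CHANNEL_ONLY = {0b1011, 0b1100, 0b1101, 0b1110}   # 224, 256, 320, 384 kbit/s
ANY_MODE = {0b0100, 0b0110, 0b0111, 0b1000, 0b1001, 0b1010}  # 64 to 192 kbit/s

SINGLE_CHANNEL = 0b11  # field 37 value for the single channel mode


def combination_allowed_48khz(bit_rate_field: int, mode_field: int) -> bool:
    """Check a (field 33, field 37) pair against the 48-kHz table."""
    if not 0 <= mode_field <= 0b11:
        raise ValueError("mode field is two bits wide")
    if bit_rate_field in ANY_MODE:
        return True
    if bit_rate_field in SINGLE_CHANNEL_ONLY:
        return mode_field == SINGLE_CHANNEL
    if bit_rate_field in MULTI_CHANNEL_ONLY:
        return mode_field != SINGLE_CHANNEL
    return False  # "0000", "1111" or any value outside four bits
```

A pair that fails this check points to a transmission error in field 33 or field 37,
although the check alone cannot tell which of the two fields is wrong.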

The mode field extension 38, the length of which is two bits as well, is significant
according to the DAB standard only if the mode field value is "01", i.e. the joint
stereo mode is in use. Then the value of the extension field 38 indicates according
to [illegible] intensity stereo mode. The following copyright bit 39 is "0" if the
audio program transmitted is not copyright protected and "1" if it is. Value "1" of
the copy bit 40 indicates that the program transmitted is an original and value "0"
that it is a copy. The value of the emphasis field 41 corresponds according to the
ISO/IEC 11172-3 standard to the emphasis used in the coding of the program. The
DAB standard does not allow emphasis, so according to the DAB standard, the value
of the field 41 is always "00".

An encoder according to the ISO/IEC 11172-3 or ISO/IEC 13818-3 standard divides
the original pulse-code-modulated signal into 32 subbands (cf. filter bank 4 in
Fig. 1). For one frame, the encoder reads 36 samples from each subband and arranges
them into three 12-sample groups. For each group the encoder determines a scale
factor, or a coefficient for normalising the subbands for transmission. The mutual
relationship of the magnitude of the group scale factors determines whether the
encoder includes all three scale factors in the frame or takes advantage of the
(near) identicalness of the scale factors by including in the frame only one or two
scale factors. The number of scale factors per particular subband is represented by
a subband specific SCFSI parameter, to which a reference was made above in the
description of Fig. 3. For each scale factor there is in the frame scale factor part a
six-bit codeword, allowing values "000000" through "111110".

The encoder of the transmitting apparatus continually monitors the frequency
spectrum of the audio signal encoded and compares it with a so-called
psychoacoustic model on the basis of which it divides the limited number of bits
coming to each frame among the subbands. This so-called bit allocation procedure
reserves the most bits for those parts of the signal that are the most important for the
auditory impression. The same procedure determines the number of quantising
levels for each subband. The least significant subbands are allocated no bits at all in
the frame, so their number of quantising levels is zero. On other subbands, allowed
numbers of quantising levels comprise 16 integers. At the sampling frequency of
48 kHz, the smallest number is 0 and the greatest, 65,535, except for the low bit
rate (32 or 48 kbit/s) modes where the maximum number of levels on the two most
significant subbands is 32,767 and on the following six subbands, 127. In the low
bit rate modes, the frame includes the samples of only the eight most significant
subbands (subbands 0 to 7). In other modes, the frame includes the samples of the
27 most significant subbands (subbands 0 to 26). At the sampling frequency of
24 kHz, the maximum number of quantising levels is 16,383 on the first four
subbands, 127 on the next seven subbands, 9 on the following nineteen subbands,
and 0 on the two least significant subbands.

To encode the samples, each sample is divided by the scale factor associated with it
and a codeword is formed from the result according to a mapping operation defined
in the standards. Each codeword comprises at least 3 and at most 16 bits, depending
on the number of quantising levels. On subbands to which the bit allocation
procedure assigned three, five or nine quantising levels, three successive samples
constitute a granule, represented by a common codeword. Its maximum allowed
value in the case of three quantising levels is 26, in the case of five quantising levels
124, and in the case of nine quantising levels 728. The mapping operation used in
the codeword generation is chosen such that the codeword cannot comprise ones
only. This is to prevent the mixing up in the receiving apparatus of codewords and
the synchronisation word "111111111111" located in the beginning of the frame.
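
The maximum granule values quoted above (26, 124 and 728) equal the number of
quantising levels cubed, minus one, which is what three samples sharing one
codeword would give. The sketch below rests on that observation, which is an
assumption drawn from the numbers rather than a statement of the mapping defined
in the standards; the function names are hypothetical.

```python
def max_granule_codeword(levels: int) -> int:
    """Largest allowed granule codeword for 3, 5 or 9 quantising levels."""
    if levels not in (3, 5, 9):
        raise ValueError("granules are used only with 3, 5 or 9 levels")
    # Three samples with `levels` possible values each: levels**3 combinations.
    return levels ** 3 - 1


def granule_codeword_allowed(value: int, levels: int) -> bool:
    """True if a received granule codeword lies in the allowed range."""
    return 0 <= value <= max_granule_codeword(levels)


def is_all_ones(value: int, nbits: int) -> bool:
    """True if an nbits-wide codeword consists of ones only (never allowed)."""
    return value == (1 << nbits) - 1
```

A received granule codeword above the maximum, or any codeword consisting of ones
only, can therefore be treated as a sign of a transmission error.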

In the digital transmission of audio signal according to the prior art, detection of
errors and the resulting error concealment attempts are based on the use of
checksums. In accordance with the above, the audio frame according to the ISO/IEC
11172-3 and ISO/IEC 13818-3 standards has one checksum field (reference
designator 19 in Fig. 3) and the audio frame according to the DAB standard has
two checksum fields (reference designators 19 and 26 in Fig. 3). The former
covers part of the header as well as the bit allocation part (reference designator 20
in Fig. 3) and the SCFSI part. The receiver calculates a checksum over the same
coverage area and if it does not equal the checksum in the received frame, a
transmission error is detected in the frame.

The checksum field in the end of the frame [illegible]. At the sampling frequency
of 48 kHz, modes where the channel specific bit rate is at least 56 kbit/s (which
corresponds to an overall bit rate of at least 56 kbit/s in the single channel mode
and 112 kbit/s in the other modes) have the scale factors protected by four
separate CRC checksums, the first of which (ScF-CRC0) covers subbands 0 through
3, the second (ScF-CRC1), subbands from 4 to 7, the third (ScF-CRC2), subbands
from 8 to 15, and the fourth of which (ScF-CRC3) covers subbands 16 through 26.
In modes where the channel specific bit rate is below 56 kbit/s, the scale factors are
protected by two CRC checksums, the first (ScF-CRC0) covering subbands 0 to 3
and the second (ScF-CRC1) covering subbands 4 to 7. At the sampling frequency of
24 kHz, the scale factors are protected by four separate CRC checksums the
first of which (ScF-CRC0) covers subbands 0 through 3, the second (ScF-CRC1),
subbands from 4 to 7, the third (ScF-CRC2), subbands from 8 to 15, and the fourth
(ScF-CRC3) covers subbands 16 through 29. Lest the positions of the first
and second checksums be changed according to the bit rate, the checksums are
[illegible] 48 kHz and 24 kHz, checksum ScF-CRC3 is the first, reading from the
beginning of the frame, and checksum ScF-CRC0 is the last, reading from the
beginning of the frame. In the case of the lower bit rate of 48 kHz, checksum
ScF-CRC1 is the first [illegible] comes thereafter. The polynomial generating all the
CRC checksums protecting the scale factors is [illegible] and each of them covers
the [illegible] most significant bits of the scale factors according to the
aforementioned grouping. The receiver uses the same polynomial to calculate the
CRC checksums for the most significant bits of the scale factors and if any one of
them does not equal the checksum in the received frame, a transmission error is
detected in the frame.
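
The grouping of the scale factor checksums by subband can be summarised as a
lookup. The sketch below is illustrative only: the function name and the tuple
layout are hypothetical, and it assumes the subband ranges as they can be read
from the passage above (0 to 3, 4 to 7, 8 to 15, and 16 to 26 or 16 to 29).

```python
from typing import List, Tuple

Group = Tuple[str, int, int]  # (checksum name, first subband, last subband)


def scf_crc_groups(sampling_khz: int, channel_bit_rate_kbps: int) -> List[Group]:
    """Return the subband range covered by each scale factor CRC."""
    low = [("ScF-CRC0", 0, 3), ("ScF-CRC1", 4, 7)]
    if sampling_khz == 24:
        return low + [("ScF-CRC2", 8, 15), ("ScF-CRC3", 16, 29)]
    if sampling_khz == 48:
        if channel_bit_rate_kbps >= 56:
            return low + [("ScF-CRC2", 8, 15), ("ScF-CRC3", 16, 26)]
        return low  # only subbands 0 to 7 carry samples at the lowest rates
    raise ValueError("only 24 kHz and 48 kHz are described")


def crc_for_subband(subband: int, groups: List[Group]) -> str:
    """Name the checksum that protects the scale factors of one subband."""
    for name, first, last in groups:
        if first <= subband <= last:
            return name
    raise ValueError("subband is not covered by any scale factor CRC")
```

Such a mapping lets a receiver localise a failed checksum to a group of subbands
instead of rejecting all scale factors in the frame.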

The aforementioned standards ETS 300 401, ISO/IEC 11172-3 and ISO/IEC
13818-3 do not specify a mandatory model of operation according to which the
receiver should respond to transmission errors it detects in received audio frames.
However, various operating model alternatives are known from recommendatory
parts of the standards and from other telecommunications technology. In digital
mobile phone technology, where the voice signal is transmitted in frames, it is usual
that a receiver will not reproduce an audio part conveyed by a frame that was
detected erroneous but mutes the sound reproduction unit totally for a moment or
replaces the rejected frame with noise. Another option is that instead of the
erroneous frame the receiver re-plays the preceding error-free frame. Since,
however, the audio technology according to this patent application aims at sound
reproduction of substantially better quality than that of telephone technology,
automatic muting or substitution of a whole frame would degrade the auditory
impression too much.

Another disadvantage of the prior art is that checksums are not a 100% reliable
method to detect all transmission errors. If several errors occur in one and the same
frame, it is possible that their effects on the checksum cancel each other out
so that the checksum appears correct in spite of the errors in the frame.
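
The cancelling effect can be demonstrated with a small CRC routine. This is a
sketch under assumptions: the 8-bit generator polynomial 0x1D
(x^8 + x^4 + x^3 + x^2 + 1) and the message bytes are chosen purely for
illustration. Because a CRC is linear, any error pattern that is a multiple of the
generator polynomial leaves the checksum unchanged.

```python
def crc8(data: bytes, poly: int = 0x1D) -> int:
    """Bitwise, MSB-first CRC-8 with zero initial value (illustrative only)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def corrupt_undetectably(data: bytes, position: int) -> bytes:
    """Flip the bits of the 9-bit generator 0x11D across two adjacent bytes."""
    damaged = bytearray(data)
    damaged[position] ^= 0x01      # top bit of the generator (x^8)
    damaged[position + 1] ^= 0x1D  # remaining bits (x^4 + x^3 + x^2 + 1)
    return bytes(damaged)


original = bytes([0x12, 0x34, 0x56, 0x78])
damaged = corrupt_undetectably(original, 1)
same_checksum = crc8(original) == crc8(damaged)
```

In this example five bits of the message are flipped, yet `same_checksum` is true,
so a receiver relying on the checksum alone would accept the damaged data.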

An object of this invention is to provide a method and equipment with which
detection and concealment of errors are performed in the reception of a digital audio
signal more reliably than in the prior-art solutions. Another object of the invention
is to provide a method and equipment suitable for digital audio reception with which
the concealment of transmission errors distorts only a little the auditory impression
of a reproduced sound.

The objects of the invention are achieved by observing in the decoding and error
concealment units of the receiver several successive frames and arranging their
decoding and the audio signal reconstruction in a suitable manner.

The method according to the invention is characterised in that it comprises stages
wherein

- several successive frames are stored in memory,

- one frame stored in memory is chosen as the current frame,

- the current frame is examined for errors, and

- errors detected in the current frame are concealed using the contents of other
stored frames.

The invention is also directed to a decoding apparatus to realise the method
according to the invention. The apparatus according to the invention is characterised
in that the reconstructing block in it comprises

- a table for the temporary storing of frames,

- read and write means to write frames to said table and read frames from it in
windows,

- means for verifying the integrity of a frame included in the window read, and

- means for replacing erroneous values in the current frame with values obtained
from other frames in the window.

The method according to the invention aims at a balanced solution in which the
optimal transmission error detection and concealment level is achieved using
reasonable computing capacity. The receiver receives and stores several successive
frames which, when stored, form a certain frame table. To read the table, the
receiver uses a certain window the magnitude of which is an integer number of
frames greater than zero and which covers at least the current frame. In a preferred
embodiment, the window also covers at least one frame received prior to the current
frame and at least one frame received after the current frame. Decoding of frames in
the window area is performed in stages. The latest frame arriving in the window
area is first decoded until its scale factors are found out. Then the receiver conceals
possible errors found in the scale factors of the current frame. In the concealment, it
utilises scale factors of other frames in the window area. Next, the receiver
continues decoding the latest frame until its samples are dequantised but not yet
scaled. After that, the receiver uses frames in the window area in order to conceal
errors that it may have found in the unscaled samples of the current frame. Only
then are the samples of the current frame scaled and by means of inverse filtering a
PCM signal is generated, which is taken to the output port of the decoder.

Having processed one frame the receiver moves the observation window one frame
forward with respect to the frame table, whereafter the frame decoding described
above starts over again. The method according to the invention is very suitable for
parallel processing as the reception of new frames, their storing in the frame table,
detection and concealment of errors in the current frame, the inverse filtering of the
corrected frame and writing to the output data flow can be separate parts
functioning in parallel.

In the method according to the invention, detection of errors is based both on the
use of checksums and on the use of so-called fundamental sets of allowed values.
The latter means that if the receiver detects in a certain part of a received frame a bit
combination which is not a combination allowed for that part of the frame, as
specified by the standards, it assumes that there is a transmission error in that
particular part. For both the scale factors and samples, the receiver tries to replace
the values assumed erroneous with correct values found in the nearest possible
frame. Only in a situation where correct replacement values cannot be found in the
whole observation window area is the total or partial muting of the reproduced
signal used as a means to conceal the erroneous part.
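
The replacement rule can be sketched as a search over the window. The sketch makes
several assumptions that are not stated above: window positions are given as
integer indexes with 0 for the current frame, negative indexes for earlier frames
and positive indexes for later ones, and an earlier frame is preferred when two
frames are equally near. All names are hypothetical.

```python
from typing import Callable, Dict, Optional


def conceal_from_window(
    values_by_index: Dict[int, int],
    is_allowed: Callable[[int], bool],
) -> Optional[int]:
    """Return the value to use for the current frame, or None to mute.

    values_by_index maps a window index to the value that the frame at that
    index carries in the position being examined.
    """
    current = values_by_index[0]
    if is_allowed(current):
        return current
    # Nearest frames first; on a tie the earlier frame wins (an assumption).
    others = sorted((i for i in values_by_index if i != 0),
                    key=lambda i: (abs(i), i))
    for index in others:
        candidate = values_by_index[index]
        if is_allowed(candidate):
            return candidate
    return None  # nothing usable in the whole window: caller mutes


def scale_factor_allowed(codeword: int) -> bool:
    """Six-bit scale factor codewords "000000" through "111110" are allowed."""
    return 0 <= codeword <= 0b111110
```

With the scale factor rule, a current value of 63 ("111111") would be replaced by
the value of the nearest frame whose codeword lies in the allowed range.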

Size of the observation window may in one preferred embodiment of the invention
be a dynamically variable parameter so that the method is adapted to different
conditions causing transmission errors. One way of estimating error conditions on a
longer term than one frame is to maintain a continually updated error parameter that
represents the bit error ratio (BER) of the received signal. The receiver may also use
the error parameter value to make other decisions concerning decoding and error
concealment. If the average error level is high, it may be more advantageous to
process an uncorrectable error by muting a whole frame, whereas with a low
average error level, muting one or a few subbands is a better solution.
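
One possible way to keep such an error parameter is an exponentially weighted
average, sketched below. The averaging method, the smoothing factor, the threshold
and all names are assumptions made for illustration; the text above does not say
how the estimate is updated or where the decision threshold lies.

```python
class ErrorLevelEstimate:
    """Running estimate of the bit error ratio (hypothetical update rule)."""

    def __init__(self, smoothing: float = 0.1,
                 mute_frame_threshold: float = 0.01) -> None:
        if not 0.0 < smoothing <= 1.0:
            raise ValueError("smoothing must be in (0, 1]")
        self.smoothing = smoothing
        self.mute_frame_threshold = mute_frame_threshold
        self.value = 0.0

    def update(self, suspected_bit_errors: int, bits_checked: int) -> float:
        """Blend the error ratio observed in one frame into the estimate."""
        if bits_checked <= 0:
            raise ValueError("bits_checked must be positive")
        ratio = suspected_bit_errors / bits_checked
        self.value += self.smoothing * (ratio - self.value)
        return self.value

    def action_for_uncorrectable_error(self) -> str:
        """High average error level: mute the frame; low: mute subbands."""
        if self.value >= self.mute_frame_threshold:
            return "mute_frame"
        return "mute_subbands"
```

The receiver would call `update` once per frame and consult
`action_for_uncorrectable_error` only when no replacement value is available.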



The invention is described in more detail with reference to the preferred
embodiments presented by way of example and to the accompanying drawing wherein

Fig. 1 shows a known encoder,

Fig. 2 shows a known decoder,

Fig. 3 shows a known digital audio frame,

Fig. 4 shows a known header in the frame according to Fig. 3,

Fig. 5 shows tabulation and windowing of audio frames according to the
invention,

Fig. 6 shows in the form of a flow diagram a detail of the method according to
the invention,

Fig. 7 shows the order of actions in a stage of the method according to the
invention, and

Fig. 8 shows the decoder according to the invention.



Above in conjunction with the description of the prior art reference was made to
Figs. 1 to 4, so below in the description of the invention and its preferred
embodiments reference will be made mainly to Figs. 5 through 8. Like elements in
the Figures are denoted by like reference designators.

In the method according to the invention, the receiver uses the data contents of
several successive frames to decode the frame being processed at a given time and
to conceal the errors possibly detected in it. Fig. 5 shows a ring-like frame table 42.
The shape of the table as such is of no concrete significance because in a preferred 
embodiment of the invention it only exists as a certain number of computer memory 
locations but since so-called cyclic pointing is advantageously used for pointing to 
the one-frame blocks 42a in the table, it is illustrative to present the table in a
ring-like form. Cyclic pointing means that a given block, having the address [k], is
the same as the block having the address [k + NFRMS], where
NFRMS is the number of blocks in the table. The receiver according to the
invention initially stores each received frame in the table 42 according to Fig. 5 in
the form which the frame has as it arrives at the input port of the decoder.

Fig. 5 also shows a window 43 used by the decoder of the receiver to decode the
frames and to detect and conceal the transmission errors possibly occurring in them.
Size of the window is an integer number of frames greater than zero. The index of
the frame in the middle of the window, which identifies the frame within the
window, is 0, and the frame is called the current frame. Those frames in the window
that have been received and stored in the table 42 after the current frame are
successor frames and the one farthest away from the current frame is the front
frame. Those frames in the window that have been received and stored in the table
42 before the current frame are predecessor frames and the one farthest away from
the current frame is the rear frame. The number of successor frames is marked cnnxt
(from "current number of next frames") and the number of predecessor frames is
marked cnpre (from "current number of previous frames"). The values of cnnxt and
cnpre can change dynamically in a manner which will be described in more detail
later on, but they must satisfy the double inequality 0 ≤ (cnpre+cnnxt) < NFRMS
for the size of the window 43 in frames (= cnpre+cnnxt+1) to be always at least 1
and not more than NFRMS. If the size of the window 43 is one frame, the names
front frame, rear frame, and current frame all mean one and the same frame.

Frames in the window 43 are indexed in a manner which is independent of frame
location in the table 42. The index of the current frame is 0, as was stated above.
The indexes of successor frames are positive integers such that the index of the
successor frame nearest to the current frame is 1, the index of the next successor
frame is 2 and so on; the index of the front frame is +cnnxt.
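
The relation between window indexes and table addresses can be sketched with
modular arithmetic. This is an illustration under assumptions: the class and
method names are hypothetical, and predecessor frames are given negative indexes,
which mirrors the positive indexes of successor frames but is not stated above.

```python
class FrameWindow:
    """Window of cnpre + cnnxt + 1 frames over a cyclic table of nfrms blocks."""

    def __init__(self, nfrms: int, cnpre: int, cnnxt: int) -> None:
        if cnpre < 0 or cnnxt < 0 or not 0 <= cnpre + cnnxt < nfrms:
            raise ValueError("need 0 <= cnpre + cnnxt < NFRMS")
        self.nfrms = nfrms
        self.cnpre = cnpre
        self.cnnxt = cnnxt
        self.current = 0  # table address of the current frame (index 0)

    @property
    def size(self) -> int:
        return self.cnpre + self.cnnxt + 1

    def address(self, index: int) -> int:
        """Table address of the frame with the given window index."""
        if not -self.cnpre <= index <= self.cnnxt:
            raise IndexError("index lies outside the window")
        return (self.current + index) % self.nfrms  # cyclic pointing

    def advance(self) -> None:
        """Move the window forward by one table block."""
        self.current = (self.current + 1) % self.nfrms


window = FrameWindow(nfrms=8, cnpre=2, cnnxt=3)
window.current = 7
front_address = window.address(window.cnnxt)
rear_address = window.address(-window.cnpre)
```

With eight blocks and the current frame at address 7, the front frame (index +3)
wraps around to address 2 while the rear frame (index -2) sits at address 5.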

Fig. 6 shows in the form of a flow diagram a program loop intended for converting
the audio data carried by the current frame into PCM format in as error-free a
manner as possible. In the description below it should be especially noted that the
operations are directed alternately to the different frames and to understand the
description it is essential that the reader not mix up the frames with each other. The
execution of the program loop starts in accordance with Fig. 6 with the receiver
starting in step 44 to decode the front frame and continuing doing so until the scale
factors of the front frame have been decoded. After that, the receiver checks in step
45 in a manner described later on whether there are transmission errors in the scale
factors of the current frame and, if necessary, conceals them in step 46 using a
method described later on. Then the receiver continues decoding the front frame in
accordance with step 47 until the subband samples in it have been dequantised but
not yet scaled by multiplying them by the scale factors included in the frame. Next,
the receiver checks in step 48 in a manner described later on whether there are
transmission errors in the subband samples of the current frame and, if necessary,
conceals them in step 49 using a method described later on. Then the receiver
carries out in step 50 the scaling of samples of the current frame in a known
manner and directs the scaled samples to inverse filtering where a PCM signal is
generated and taken further to the output port of the decoder. Finally, the receiver
moves in accordance with step 51 the window forward by one table block (i.e. takes
a new frame as front frame, subtracts one from the indexes of all the frames that
were in the window already and drops the rear frame from the window) and starts
the decoding again with the new front frame in step 44. Decoding continues as long
as the receiver is in operation and new frames are being received and stored in the
table 42.
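
The order of steps 44 to 51 can be summarised in a small driver. This is a sketch
only: the handler dictionary and the function name are hypothetical, the handlers
stand in for decoding work that is not shown, and steps 45 and 48 are assumed to
report whether errors were found.

```python
from typing import Callable, Dict, List

# Frame that each step of Fig. 6 operates on, as described above.
STEP_TARGET = {
    44: "front",    # decode until the scale factors are available
    45: "current",  # check scale factors for errors
    46: "current",  # conceal scale factor errors (only if 45 found any)
    47: "front",    # continue until samples are dequantised but unscaled
    48: "current",  # check subband samples for errors
    49: "current",  # conceal sample errors (only if 48 found any)
    50: "current",  # scale samples and pass them to inverse filtering
    51: "window",   # move the window forward by one table block
}


def run_cycle(handlers: Dict[int, Callable[[], object]]) -> List[int]:
    """Run one pass of the loop and return the step numbers executed."""
    executed: List[int] = []

    def run(step: int) -> object:
        executed.append(step)
        return handlers[step]()

    run(44)
    if run(45):
        run(46)
    run(47)
    if run(48):
        run(49)
    run(50)
    run(51)
    return executed


handlers = {step: (lambda: False) for step in STEP_TARGET}
handlers[45] = lambda: True  # pretend scale factor errors were detected
trace = run_cycle(handlers)
```

In this run the trace contains step 46 but not step 49, since only the scale factor
check reported errors.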

The flow diagram in Fig. 6 does not imply that the method according to the
invention could be carried out only as a series of temporally successive operations.
If the receiver can perform several parallel processes simultaneously, the directing
of a decoded current frame to inverse filtering and therefrom in PCM format to the
decoder's output port can occur in parallel with the starting of a new decoding
operation. Similarly, the storing of new frames in the table 42 outside the area
covered by the window 43 and the removal of frames already dropped from the
window 43 (in practice, the receiver overwrites the old frames in the memory with
new ones) can occur at the same time that the frames in the window are being
processed.

Size of the window 43 may change during the operation of the receiver as long as
the size-limiting numbers cnnxt and cnpre do not violate the condition given above
in the form of the double inequality. The number of successor frames is directly
proportional to the decoding delay produced by the decoder. If for some reason it is
desirable to increase the delay, the receiver can execute the program loop according
to Fig. 6 in such a way that it leaves out the index subtraction operation according to
step 51 until the desired delay is achieved. Then the current frame remains the same
in each cycle and only a new front frame appears in the window which is one index
further away from the current frame than the previous front frame (cnnxt increases).
If it is desirable to decrease the delay (cnnxt decreases), the receiver can in step 51
subtract from the indexes of the frames in the window a number greater than 1 (to
be precise, the number [1+(cnnxt_old - cnnxt_new)], where cnnxt_old is the value of
cnnxt before decreasing the delay, and cnnxt_new is the value of cnnxt after
decreasing the delay). Then the index of at least one frame jumps over zero, i.e. the frame
in question never becomes the current frame. This may result in a parsing distortion
in the auditory impression of the sound reproduced, even though inverse filtering
generally tends to reduce the effect of such distortions. The receiver may also move
the rear boundary (the boundary at the rear frame side) of the window 43 forward
(cnpre decreases) or backward (cnpre increases). This has no effect on the decoder
delay.
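
The index arithmetic for decreasing the delay can be checked with a short worked
example. The function names are hypothetical; the formula is the one given above,
and the list of skipped frames follows from it, since every old index smaller than
the subtracted number becomes negative without ever being zero.

```python
from typing import List


def index_shift(cnnxt_old: int, cnnxt_new: int) -> int:
    """Number subtracted from every window index in step 51."""
    if cnnxt_new > cnnxt_old:
        # Increasing the delay is done by omitting the subtraction instead.
        raise ValueError("use this only when cnnxt stays the same or decreases")
    return 1 + (cnnxt_old - cnnxt_new)


def skipped_indexes(cnnxt_old: int, cnnxt_new: int) -> List[int]:
    """Old window indexes of frames that never become the current frame."""
    shift = index_shift(cnnxt_old, cnnxt_new)
    # The frame with old index `shift` becomes the new current frame;
    # frames with old indexes 1 .. shift - 1 jump over zero.
    return list(range(1, shift))
```

For example, reducing cnnxt from 3 to 1 subtracts 3 from every index, so the frames
that previously had indexes 1 and 2 are never decoded as the current frame.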

Next it will be discussed how the receiver determines there is a transmission error in
a frame. The ISO/IEC 11172-3 standard includes specifications for calculating a
first CRC checksum concerning part of the audio frame header (cf. reference
designator 19 in Fig. 3). In addition, the DAB standard includes specifications for
calculating a second CRC checksum concerning the frame scale factors (cf.
reference designator 26 in Fig. 3). Above it was discussed how the receiver uses
checksums to detect errors. In the method according to the invention the receiver
also verifies that certain frame elements contain values that are allowed according to
the DAB and ISO/IEC 11172-3 and ISO/IEC 13818-3 standards. In the list below
the checks are named as they appear in the standards in English. Some of the checks
apply only to communications according to the DAB standard as the ISO/IEC
11172-3 and ISO/IEC 13818-3 standards do not define equivalent data structures.
These checks, however, do not violate the ISO/IEC 11172-3 or ISO/IEC 13818-3
standard as they are directed to frame elements left unspecified in those standards.

* SYNCWORD: if the value of the synchronisation word is other than "1111 1111
1111", there is a transmission error in the frame.

* LAYER: layer codes "01" and "11" are not allowed in DAB communications, so
their appearance indicates an error.

* PROTECTION: in DAB communications, the protection bit has to be "0", so the
value "1" indicates an error.

* BIT RATE: according to the ISO/IEC 11172-3 and ISO/IEC 13818-3 standards,
the value "1111" is not allowed; furthermore, the value "0000" is not allowed in the
DAB standard.

* SAMPLING FREQUENCY: according to the DAB standard, the sampling 
frequency values "00" and "10" are not allowed. 

* PADDING BIT: if the sampling frequency is 48 kHz or 24 kHz, the padding
indicator bit has to be "0", otherwise it is erroneous.

* MODE: sampling frequency, mode and bit rate combinations that are not included
as allowed combinations in the table presented above in conjunction with the
description of the prior art or in the ISO/IEC 13818-3 standard, indicate an error.

* EMPHASIS: according to the DAB standard, the value of the emphasis field has
to be "00"; other values indicate an error.

* BIT ALLOCATION: the total number of bits reserved for the subbands cannot
exceed the space reserved for those bits in the frame. The total number of bits
depends on the bit rate. A conflict between the bit rate and the total number of
reserved bits indicates an error.

* ID BIT CHANGE: if the ID bit is changed without the decoder knowing about the
change beforehand, the receiver interprets the change as an error.

* BIT RATE CHANGE: if the bit rate is changed without the decoder knowing
about the change beforehand, the receiver interprets the change as an error.

* SAMPLING FREQUENCY CHANGE: if the sampling frequency is changed
without the decoder knowing about the change beforehand, the receiver interprets
the change as an error.

* MODE CHANGE: if the audio mode is changed without the decoder knowing
about the change beforehand, the receiver interprets the change as an error;
however, a change between the stereo mode and joint stereo mode in the one
direction or the other is not interpreted as an error.

* LAYER CHANGE: if the layer is changed without the decoder knowing about the
change beforehand, the receiver interprets the change as an error.

* SCALE FACTOR INDEX: the scale factor index value "111111" is not allowed
so its appearance indicates an error.

* SUBBAND SAMPLE CODEWORD: if NLEVELS refers to the quantising levels
of a given subband and it is 3, sample codewords greater than 26 (decimal) are
illegal. If NLEVELS is 5, codewords greater than 124 (decimal) are illegal. If
NLEVELS is 9, codewords greater than 728 (decimal) are illegal. Otherwise
codewords comprising only ones are illegal.

* PCM SAMPLE RANGE: there exist certain limits for the PCM signal generated at
the inverse filtering. PCM pulses the absolute values of which exceed the maximum
limit indicate an error. PCM pulses exceeding the limit are usually clipped to the
maximum value before sound reproduction.
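
Several of the checks listed above operate on fixed-width bit fields, so they can be sketched compactly. The following is a minimal illustration, not the receiver's actual implementation. It assumes the 32-bit header field order of ISO/IEC 11172-3 and covers only the SYNCWORD, LAYER, PROTECTION, BIT RATE, SAMPLING FREQUENCY, PADDING BIT, EMPHASIS and SUBBAND SAMPLE CODEWORD checks. All Python names (FIELDS, unpack_header, pack_header, header_syntax_errors, codeword_is_legal) are hypothetical.

```python
# Bit widths of the 32-bit header fields, most significant field first
# (field order as in ISO/IEC 11172-3; the Python names are illustrative).
FIELDS = (("syncword", 12), ("id_bit", 1), ("layer", 2), ("protection", 1),
          ("bit_rate", 4), ("sampling", 2), ("padding", 1), ("private", 1),
          ("mode", 2), ("mode_ext", 2), ("copyright", 1), ("original", 1),
          ("emphasis", 2))


def unpack_header(word):
    """Split a 32-bit header word into a dict of field values."""
    fields, shift = {}, 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


def pack_header(**values):
    """Build a header word; the defaults pass every check sketched below."""
    defaults = {"syncword": 0xFFF, "id_bit": 1, "layer": 0b10,
                "bit_rate": 0b1010, "sampling": 0b01}
    word = 0
    for name, width in FIELDS:
        word = (word << width) | values.get(name, defaults.get(name, 0))
    return word


def header_syntax_errors(word):
    """Return the names of the header checks that the word fails."""
    f = unpack_header(word)
    errors = []
    if f["syncword"] != 0xFFF:
        errors.append("SYNCWORD")
    if f["layer"] in (0b01, 0b11):
        errors.append("LAYER")
    if f["protection"] != 0:
        errors.append("PROTECTION")
    if f["bit_rate"] in (0b0000, 0b1111):
        errors.append("BIT RATE")
    if f["sampling"] in (0b00, 0b10):
        errors.append("SAMPLING FREQUENCY")
    # Code "01" is taken here to mean 48 kHz (ID = 1) or 24 kHz (ID = 0).
    if f["sampling"] == 0b01 and f["padding"] != 0:
        errors.append("PADDING BIT")
    if f["emphasis"] != 0b00:
        errors.append("EMPHASIS")
    return errors


def codeword_is_legal(codeword, nlevels, nbits):
    """SUBBAND SAMPLE CODEWORD check for one codeword of nbits bits."""
    limits = {3: 26, 5: 124, 9: 728}  # largest legal codeword per NLEVELS
    if nlevels in limits:
        return codeword <= limits[nlevels]
    return codeword != (1 << nbits) - 1  # all ones is illegal


print(header_syntax_errors(pack_header(emphasis=0b01)))
```

The remaining checks (MODE, BIT ALLOCATION and the CHANGE checks) need either the table of allowed combinations or state carried over from the previous frame, so they are left out of this sketch.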

Some of the aforementioned syntax errors, or errors in which a value does not
belong to the fundamental set of allowed values specified for that particular field,
also result in an error detected by means of checksums. There are, however,
situations in which a syntax error does not have a net effect on the checksum, so
syntax checks make the detection of transmission errors more efficient.

Next, the operation of the receiver will be discussed in a situation in which it has
detected a transmission error. Location of the error in the frame determines how
severely it affects the decoding of the frame and the reproduction of the audio signal
carried by the frame. If the error is in the area covered by the first checksum (error
is indicated by calculation of the first checksum or by any one of the checks BIT
RATE, SAMPLING FREQUENCY, PADDING BIT, MODE, EMPHASIS, BIT
ALLOCATION, BIT RATE CHANGE, SAMPLING FREQUENCY CHANGE or
MODE CHANGE) or if the check ID BIT CHANGE indicates the error, the whole
frame has to be discarded. The second checksum field for the scale factors has, as
described earlier, two or four checksums, each of which is directed to the scale
factors of a certain subband group. If the calculation of any one of these checksums
or the aforementioned SCALE FACTOR INDEX check indicates the error, the
receiver according to the invention regards all scale factors in that particular group
as unreliable.
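
The severity rule in the paragraph above can be summarised as a small lookup. This is a hedged sketch: the check names follow the text, "FIRST CRC" is a label introduced here for the first checksum, and the function name classify_errors and its return format are hypothetical.

```python
# Checks whose failure means the whole frame has to be discarded.
WHOLE_FRAME_CHECKS = frozenset({
    "FIRST CRC", "BIT RATE", "SAMPLING FREQUENCY", "PADDING BIT", "MODE",
    "EMPHASIS", "BIT ALLOCATION", "BIT RATE CHANGE",
    "SAMPLING FREQUENCY CHANGE", "MODE CHANGE", "ID BIT CHANGE",
})


def classify_errors(failed_checks, bad_scale_factor_groups):
    """Decide how much of the current frame must be replaced.

    failed_checks           -- names of header-level checks that failed
    bad_scale_factor_groups -- indexes of subband groups whose scale factor
                               checksum or SCALE FACTOR INDEX check failed
    """
    if WHOLE_FRAME_CHECKS & set(failed_checks):
        return "discard_frame", []
    if bad_scale_factor_groups:
        return "unreliable_groups", sorted(set(bad_scale_factor_groups))
    return "ok", []


print(classify_errors(["MODE CHANGE"], [1]))   # header error dominates
print(classify_errors([], [2, 0, 2]))          # only some groups are affected
```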

Discarding the frame means that the sample values transmitted by the frame have to
be replaced by error-free or at least less erroneous values. Similarly, interpreting a
certain scale factor group as unreliable means that those scale factors have to be
replaced by better values. In the method according to the invention, better values are
sought using the table and window arrangement described above as well as the
operating procedure shown in Fig. 7. The receiver looks for better values first in the
predecessor frame closest to the current frame in step 52. If no better values are
found there, the receiver next searches the successor frame closest to the current
frame in step 53. The search continues alternately in predecessor and successor
frames (steps 54 and 55) until the receiver either finds better values or has searched
the whole window (steps 56 and 57). The latter case means that no better values can
be obtained from any frame in the window and the error is thus uncorrectable and
the erroneous values have to be replaced by zeroes. If the error was in the scale
factors, the use of zeroes mutes the corresponding subbands for the current frame. If
the error was in the area covered by the first checksum, the whole frame has to be
muted.
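
The alternating search of Fig. 7 can be sketched as follows. This is a minimal illustration, assuming the window is held as a dictionary keyed by frame index and that a caller-supplied predicate decides whether a candidate frame holds better values; the names search_order and find_substitute are hypothetical.

```python
def search_order(cnpre, cnnxt):
    """Frame indexes in the order they are searched: -1, +1, -2, +2, ..."""
    order = []
    for distance in range(1, max(cnpre, cnnxt) + 1):
        if distance <= cnpre:
            order.append(-distance)   # predecessor at this distance
        if distance <= cnnxt:
            order.append(distance)    # successor at this distance
    return order


def find_substitute(window, cnpre, cnnxt, is_usable, zero_value):
    """Return values from the nearest usable frame, or zero_value (mute)."""
    for index in search_order(cnpre, cnnxt):
        candidate = window.get(index)
        if candidate is not None and is_usable(candidate):
            return candidate
    return zero_value  # whole window searched: the error is uncorrectable


# Example: scale factors of frame -1 are damaged too, so frame +1 is used.
window = {-1: {"ok": False, "scf": [9, 9]}, 1: {"ok": True, "scf": [12, 14]}}
print(search_order(2, 3))
print(find_substitute(window, 1, 1, lambda f: f["ok"], {"ok": True, "scf": [0, 0]}))
```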

The error detection techniques described above, except for the ID BIT CHANGE
check, are directed only to those parts of the frame that belong to the coverage area
of the first or second checksum and/or for which there is a certain fundamental set
of allowed values. In audio frames according to the ISO/IEC 11172-3 standard, the
audio samples and all scale factors are unprotected. During transmission, errors may
occur also in these parts of the frame, resulting in annoying distortion in the sound
reproduced by the receiver. The present invention prepares also for errors occurring
in the unprotected areas. In the solution according to the invention, the receiver
continuously maintains an estimate of the mean bit error ratio (BER) of the received
signal. The estimate may be a single parameter the value of which increases in
proportion to the number of errors detected by the receiver in the latest processed
frames. In a more versatile alternative the BER estimate may be a record comprising
several fields such as the number of errors detected in N latest frames, where N is
an integer; the time derivative of the bit error ratio, i.e. whether the ratio is increasing
or decreasing; mutual ratios of successfully concealed and uncorrected errors, etc.
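
A record of the kind described above might be kept as in the following sketch. It is an assumption-laden illustration: the class name BerEstimate, the default of 8 frames, and the use of detected errors per frame as a stand-in for the true bit error ratio are all choices made here, not taken from the source.

```python
from collections import deque


class BerEstimate:
    """Running record of detected errors over the N latest frames."""

    def __init__(self, n_frames=8):
        self.history = deque(maxlen=n_frames)  # errors detected per frame
        self.concealed = 0     # errors hidden with substitute values
        self.uncorrected = 0   # errors that ended in muting

    def update(self, errors_detected, concealed=0, uncorrected=0):
        """Record the outcome of one processed frame."""
        self.history.append(errors_detected)
        self.concealed += concealed
        self.uncorrected += uncorrected

    @property
    def level(self):
        """Mean number of detected errors per frame in the record."""
        return sum(self.history) / len(self.history) if self.history else 0.0

    @property
    def trend(self):
        """Positive if the newer half of the record holds more errors."""
        frames = list(self.history)
        half = len(frames) // 2
        if half == 0:
            return 0
        return sum(frames[-half:]) - sum(frames[:half])


ber = BerEstimate(n_frames=4)
for errors in (0, 0, 2, 4):
    ber.update(errors)
print(ber.level, ber.trend)  # error level is rising in this example
```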

One way of using the BER estimate against errors occurring in the unprotected parts
of the frames is e.g. such that if the BER estimate shows a generally high error
level, the receiver will not allow sudden great changes in the values of scale factors
or samples but interprets them as errors that should be concealed. But if the error
level is generally low, the receiver will also reproduce sound elements conveyed by
sudden changes. Furthermore, when the mean bit error ratio is high, it may be
advantageous that even if the uncorrected error were in the scale factors, the
receiver mutes the whole frame and not only the subbands associated with said scale
factors.
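
One possible reading of this rule is sketched below for scale factor indexes. The thresholds, the parameter names and the choice to conceal a suspicious value by repeating the previous frame's value are all assumptions made for illustration; the source does not give numeric limits.

```python
def max_allowed_jump(ber_level, high_ber=0.01, strict_jump=6, relaxed_jump=63):
    """Largest change between frames that is still accepted as genuine.

    All numeric defaults are hypothetical placeholders.
    """
    return strict_jump if ber_level >= high_ber else relaxed_jump


def conceal_sudden_changes(previous, current, ber_level):
    """Replace values that jump too far from the previous frame's values."""
    limit = max_allowed_jump(ber_level)
    return [p if abs(c - p) > limit else c for p, c in zip(previous, current)]


previous = [10, 10, 10]
current = [12, 40, 9]
print(conceal_sudden_changes(previous, current, ber_level=0.02))  # jump to 40 is concealed
print(conceal_sudden_changes(previous, current, ber_level=0.0))   # jump to 40 is reproduced
```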

The method according to the invention can also make use of the fact that the
receiver is usually arranged so as to clip PCM pulses the absolute values of which
exceed a certain maximum value so that they then equal said maximum value. In a
preferred embodiment of the method according to the invention the receiver counts
how often the PCM pulses need to be clipped. If one frame produces in excess of a
given threshold value PCM pulses that need to be clipped, the receiver may assume
that the frame in question contains too much noise and it must be muted by
replacing the PCM pulses with zero values. Said threshold value may depend on the
BER estimate in a manner such that the higher the mean error level the more
readily the receiver assumes the frame erroneous, i.e. the lower said threshold value.
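
The clipping counter can be sketched as follows. This is a minimal illustration, assuming 16-bit PCM and a threshold that falls linearly with the error level down to a floor; the function name clip_or_mute and every numeric default are hypothetical.

```python
def clip_or_mute(pcm, ber_level, max_abs=32767,
                 base_threshold=32, min_threshold=4, ber_scale=1000):
    """Clip out-of-range PCM pulses, or mute the frame if too many occur.

    Returns (samples, muted). The threshold is lowered as ber_level rises,
    so a noisy channel makes the receiver mute more readily.
    """
    threshold = max(min_threshold, base_threshold - int(ber_level * ber_scale))
    over_limit = sum(1 for s in pcm if abs(s) > max_abs)
    if over_limit > threshold:
        return [0] * len(pcm), True            # frame assumed too noisy
    clipped = [max(-max_abs, min(max_abs, s)) for s in pcm]
    return clipped, False


frame = [40000] * 10 + [100] * 10
print(clip_or_mute(frame, ber_level=0.0)[1])    # False: pulses are only clipped
print(clip_or_mute(frame, ber_level=0.05)[1])   # True: frame is muted
```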

Next, a digital audio receiver decoder according to the invention will be discussed,
for which Fig. 8 shows a block diagram in accordance with a preferred embodiment.
The decoder 100 comprises, not unlike a decoder of the prior art, an input port 11,
output port 12, frame decoding block 13, data port 16 and an inverse filter bank 15.
The interfaces of a reconstructing block 14 to the frame decoding block and inverse
filter bank comply with the ISO/IEC 11172-3 or ISO/IEC 13818-3 Layer II
standard. The block includes a memory 58 which forms a table 42 according to Fig.
5. In addition, the reconstructing block includes a read and write element 59 which
writes the new frames coming from the frame decoding block in the table, reads a
windowful 43 of stored frames to be processed, and takes the decoded and scaled
samples from each current frame to be directed to the inverse filter bank. In
conjunction with the read and write element there is a bit error ratio computing
block 60 which estimates the bit error ratio of the received signal and, on the basis of
that, controls the operation of the read and write element and, if necessary, the
replacement with zeroes of the PCM samples in connection with the inverse
filtering. The latter is carried out as described above if, in conjunction with the
inverse filtering, too many exceedings of the maximum allowed pulse limit are
detected with respect to the bit error ratio.

In the decoder according to the invention, the necessary functions related to the use
of memory to tabulate the frames and to the control of memory read and write
operations, error detection and concealment, are preferably realised as software
processes executed by a microprocessor included in the receiver. The drawing up of
such software processes and their coding into instructions executable by the
processor are as such known to one skilled in the art.

The invention provides an extensive and reliable method and equipment for
detecting transmission errors in a digital audio signal and for concealing errors 
detected. Writing of frames to memory and reading them in parts determined by a 
window of a certain size are computationally not unreasonably demanding 
operations, so the invention is applicable to series production of digital audio 
receivers at a cost level required for consumer electronics. The exemplary 
embodiments described above do not confine the invention but it can be modified 
within the limits defined by the claims set forth below.

Claims

1. A method for detecting and concealing errors in a digital audio receiver which
processes coded digital audio signal in frames (17) of predetermined shape,
characterised in that it comprises steps wherein

- several successive frames are stored (51) in memory (58),

- one frame stored in memory is selected as the current frame (0),

- the current frame is examined for errors, and

- errors detected in the current frame are concealed using the contents of other
stored frames (+1, +cnnxt, -1, -cnpre).



2. The method of claim 1, characterised in that the latest received frame
(+cnnxt) is stored undecoded whereafter it is decoded in stages such that

- in the first stage (44) a first part of the frame (+cnnxt) to be decoded is decoded,

- in the second stage (45) it is examined whether the part of the current frame (0)
that corresponds to said first part contains errors,

- in the third stage (47) a second part of the frame (+cnnxt) to be decoded is
decoded,

- in the fourth stage (48) it is examined whether the part of the current frame (0) that
corresponds to said second part contains errors.

3. The method of claim 2, characterised in that said frame (+cnnxt) to be
decoded is the same frame as said current frame (0).

4. The method of claim 2, characterised in that said frame (+cnnxt) to be
decoded is not the same frame as said current frame (0).

5. The method of claim 1, characterised in that it employs a certain read
window (43) to read stored frames from memory, the size of the read window being
a certain non-zero integer number of frames.



6. The method of claim 5, characterised in that the size of said memory is
NFRMS frames, where NFRMS is a positive integer, and the size of said read
window is cnnxt+cnpre+1 frames, where cnnxt and cnpre satisfy the double
inequality 0 ≤ (cnpre+cnnxt) < NFRMS, so that said read window contains cnpre
frames that have been received before the current frame, and cnnxt frames that have
been received after the current frame.

7. The method of claim 1, characterised in that said frames (17) are DAB audio
frames according to the ETS 300 401 standard.

8. The method of claim 7, characterised in that the latest received frame
(+cnnxt) is stored undecoded whereafter it is decoded in stages such that

- in the first stage (44) the beginning of the frame (+cnnxt) to be decoded is decoded
up to the scale factors,

- in the second stage (45) it is examined whether the scale factors of the current
frame (0) contain errors,

- in the third stage (47) the part of the frame (+cnnxt) to be decoded that contains
audio samples is dequantised into unscaled audio samples, and

- in the fourth stage it is examined whether the unscaled audio samples in the
current frame (0) contain errors.

9. The method of claim 8, characterised in that the audio samples of the current
frame are scaled using the scale factors of the current frame after it has been
examined whether the unscaled audio samples of the current frame contain errors
and errors detected have been concealed.

10. The method of claim 7, characterised in that the current frame is interpreted
wholly erroneous if any one of the following conditions is met:

- the first checksum (19) following the header of the frame is not in accord with the
contents of its coverage area,

- contents of the field (33) indicating bit rate are "0000" or "1111",

- contents of the field (34) indicating sampling frequency are "00" or "10",

- value of padding indicator bit (35) is "1",

- contents of the field (33) indicating bit rate are "0001", "0010", "0011" or "0101"
while at the same time the ID bit (30) is "1" and the contents of the field (34)
indicating sampling frequency are "01" and the contents of the field (37) indicating
mode are "00", "01" or "10",

- contents of the field (33) indicating bit rate are "1011", "1100", "1101" or "1110"
while at the same time the ID bit (30) is "1" and the contents of the field (34)
indicating sampling frequency are "01" and the contents of the field (37) indicating
mode are "11",

- contents of the field (41) indicating emphasis are "01", "10" or "11",

- information conveyed by the field (33) indicating bit rate and the number of
reserved bits contradict each other,

- the value of the field (33) indicating bit rate is different from that of the previous
frame without the receiver having advance knowledge of the bit rate change,

- the value of the field (34) indicating sampling frequency is different from that of
the previous frame without the receiver having advance knowledge of the sampling
frequency change,

- the value of the field (37) indicating mode is different from that of the previous 
frame without the receiver having advance knowledge of the mode change and the 
change does not indicate a transition between the "stereo" and "joint stereo" modes, 

- the value of the ID bit (30) is different from that of the previous frame without the 
receiver having advance knowledge of the ID bit change. 

11. The method of claim 10, characterised in that an attempt is made to replace
the sample values carried by a frame interpreted wholly erroneous with error-free 
substitute values from a frame which is temporally as close to the current frame as 
possible, and if no error-free substitute values are found closer than the distance 
equalling a predetermined number of frames, the sample values of the frame 
interpreted erroneous are replaced by zero values. 

12. The method of claim 7, characterised in that the current frame is interpreted
partly erroneous if any one of the following conditions is met:

- a checksum in the second checksum field (26) at the end part of the frame is not in
accord with the contents of its coverage area,

- an index value indicating scale factor is "111111".

13. The method of claim 12, characterised in that an attempt is made to replace
the values interpreted erroneous in a frame interpreted partly erroneous with
error-free substitute values from a frame which is temporally as close to the current frame
as possible, and if no error-free substitute values are found closer than the distance
equalling a predetermined number of frames, the sample values interpreted
erroneous are replaced by zero values.

14. A decoding apparatus for decoding a coded digital audio signal in frame 
format and for detecting and concealing errors in said digital audio signal, 
comprising an input (11) and output port (12) and between them, connected in 
series, 

- a frame decoding block (13) for preprocessing frames of a digital audio signal, 

- a reconstructing block (14) for performing the decoding process proper, and

- an inverse filtering block (15) for converting the decoded signal into a form
directed to the output port,

characterised in that said reconstructing block comprises 

- a table (58; 42) for the temporary storing of frames, 

- read and write means (59) for writing frames to said table and reading them from it
in windows (43),

- means for examining the correctness of a current frame (0) included in a window 
(43) read, and 

- means for replacing values detected erroneous in the current frame (0) using
values obtained from other frames (+1, +cnnxt, -1, -cnpre) in the window.

15. The decoding apparatus of claim 14, characterised in that it comprises means 
(15) for limiting a signal in the form directable to the output port such that it 
conforms to predetermined limit values.

16. The decoding apparatus of claim 15, characterised in that it further comprises 
means (60) for maintaining an estimate for a signal's bit error ratio and for 
controlling error concealment operation on the basis of the current estimate for the 
bit error ratio.

17. The decoding apparatus of claim 16, characterised in that it is arranged so as 
to mute a signal part in the form directable to the output port, obtained from a 
certain frame, if it as such would cause need in excess of a certain threshold value 
to limit the signal so as to conform to limit values, said threshold value depending
on the current estimate for the bit error ratio.